Abstract. We take a Random Matrix theory viewpoint to study the spectrum of
certain reversible Markov chains in random environment. As the number of states tends
to infinity, we consider the global behavior of the spectrum, and the local behavior at the
edge, including the so-called spectral gap. Results are obtained for two simple models
with distinct limiting features. The first model is built on the complete graph while the
second is a birth-and-death dynamics. Both models give rise to random matrices with
non-independent entries.



1. Introduction 

The spectral analysis of large dimensional random matrices is a very active domain of
research, connected to a remarkable number of areas of Mathematics, see e.g. [27], [22], [3],
[10], [1], [37]. On the other hand, it is well known that the spectrum of reversible Markov
chains provides useful information on their trend to equilibrium, see e.g. [31], [15], [29], [25].
The aim of this paper is to explore potentially fruitful links between the Random Matrix 
and the Markov Chains literature, by studying the spectrum of reversible Markov chains 
with large finite state space in a frozen random environment. The latter is obtained by 
assigning random weights to the edges of a finite graph. This approach raises a collection 
of stimulating problems, lying at the interface between Random Matrix theory, Random 
Walks in Random Environment, and Random Graphs. We focus here on two elementary 
models with totally different scalings and limiting objects: a complete graph model and 
a chain graph model. The study of spectral aspects of random Markov chains or random 
walks in random environment is not new, see for instance |18[ [91 [39[ [T3] [T3l [TT] [M] and 
references therein. Here we adopt a Random Matrix theory point of view. 

Consider a finite connected undirected graph $G = (V,E)$, with vertex set $V$ and edge
set $E$, together with a set of weights, given by nonnegative random variables

    $U = \{U_{i,j} ;\ \{i,j\} \in E\}$.

Since the graph $G$ is undirected we set $U_{i,j} = U_{j,i}$. On the network $(G, U)$, we consider
the random walk in random environment with state space $V$ and transition probabilities

(1)    $K_{i,j} = \frac{U_{i,j}}{\rho_i}$  where  $\rho_i = \sum_{j : \{i,j\} \in E} U_{i,j}$.

The Markov kernel $K$ is reversible with respect to the measure $\rho = \{\rho_i,\ i \in V\}$ in that

    $\rho_i K_{i,j} = \rho_j K_{j,i}$

for all $i, j \in V$. When the variables $U$ are all equal to a positive constant this is just the
standard simple random walk on $G$, and $K - I$ is the associated Laplacian. If $\rho_{i_0} = 0$ for
some vertex $i_0$ then we set $K_{i_0,j} = 0$ for all $j \neq i_0$ and $K_{i_0,i_0} = 1$ ($i_0$ is then an isolated
vertex).
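The construction (1) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the complete graph with loops is used, the exponential law of the weights is an arbitrary choice, and the helper names `random_symmetric_weights` and `markov_kernel` are hypothetical.

```python
import random


def random_symmetric_weights(n, rng):
    """Symmetric n x n array of i.i.d. weights, loops included (illustrative law: Exp(1))."""
    u = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            u[i][j] = u[j][i] = rng.expovariate(1.0)
    return u


def markov_kernel(u):
    """Return (K, rho) with rho_i = sum_j U_ij and K_ij = U_ij / rho_i, as in (1)."""
    n = len(u)
    rho = [sum(row) for row in u]
    k = []
    for i in range(n):
        if rho[i] == 0.0:
            # Isolated vertex: K_ii = 1 and K_ij = 0 for j != i.
            k.append([1.0 if j == i else 0.0 for j in range(n)])
        else:
            k.append([u[i][j] / rho[i] for j in range(n)])
    return k, rho


rng = random.Random(0)
U = random_symmetric_weights(5, rng)
K, rho = markov_kernel(U)
```

Each row of `K` sums to one, and the detailed balance relation $\rho_i K_{i,j} = \rho_j K_{j,i}$ holds because both sides equal $U_{i,j}$.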

The construction of reversible Markov kernels from graphs with weighted edges as in
(1) is classical in the Markovian literature, see e.g. [15], [19]. As for the choice of the graph
G, we shall work with the simplest cases, namely the complete graph or a one-dimensional 
chain graph. Before passing to the precise description of models and results, let us briefly 
recall some broad facts. 

By labeling the $n = |V|$ vertices of $G$ and putting $K_{i,j} = 0$ if $\{i,j\} \notin E$, one has that
$K$ is a random $n \times n$ Markov matrix. The entries of $K$ belong to $[0, 1]$ and each row sums
up to 1. The spectrum of $K$ does not depend on the way we label $V$. In general, even if
the random weights $U$ are i.i.d., the random matrix $K$ has non-independent entries due to
the normalizing sums $\rho_i$. Note that $K$ is in general non-symmetric, but by reversibility,
it is symmetric w.r.t. the scalar product induced by $\rho$, and its spectrum $\sigma(K)$ is real.
Moreover, $1 \in \sigma(K) \subset [-1, +1]$, and it is convenient to denote the eigenvalues of $K$ by

    $-1 \le \lambda_n(K) \le \cdots \le \lambda_1(K) = 1$.

If the weights $U_{i,j}$ are all positive, then $K$ is irreducible, the eigenspace of the largest
eigenvalue 1 is one-dimensional and thus $\lambda_2(K) < 1$. In this case $\rho$ is its unique invariant
distribution, up to normalization. Moreover, since $K$ is reversible, the period of $K$ is 1
(aperiodic case) or 2, and this last case is equivalent to $\lambda_n(K) = -1$ (the spectrum of $K$
is in fact symmetric when $K$ has period 2); see e.g. [32].

The bulk behavior of $\sigma(K)$ is studied via the Empirical Spectral Distribution (ESD)

    $\mu_K = \frac{1}{n} \sum_{k=1}^{n} \delta_{\lambda_k(K)}$.

Since $K$ is Markov, its ESD contains probabilistic information on the corresponding
random walk. Namely, the moments of the ESD $\mu_K$ satisfy, for any $\ell \in \mathbb{Z}_+$,

(2)    $\int_{-1}^{+1} x^\ell \, \mu_K(dx) = \frac{1}{n} \mathrm{Tr}(K^\ell) = \frac{1}{n} \sum_{i \in V} r_\ell^U(i)$

where $r_\ell^U(i)$ denotes the probability that the random walk on $(G, U)$ started at $i$ returns
to $i$ after $\ell$ steps.
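The identity (2) can be checked by hand on small kernels. The sketch below computes $\frac{1}{n}\mathrm{Tr}(K^\ell)$ by repeated matrix multiplication; the two test kernels and the function names are illustrative choices. The first kernel is the simple random walk on the complete graph with loops on 4 vertices (spectrum $\{1, 0, 0, 0\}$), the second is the deterministic flip on two states (spectrum $\{1, -1\}$).

```python
def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def esd_moment(kernel, ell):
    """(1/n) Tr(K^ell): the average over i of the ell-step return probability."""
    n = len(kernel)
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(ell):
        power = mat_mul(power, kernel)
    return sum(power[i][i] for i in range(n)) / n


n = 4
uniform_kernel = [[1.0 / n] * n for _ in range(n)]  # constant weights, loops included
flip_kernel = [[0.0, 1.0], [1.0, 0.0]]              # period 2, eigenvalues +1 and -1
```

For `uniform_kernel` every moment of order $\ell \ge 1$ equals $1/n$, the weight of the eigenvalue 1; for `flip_kernel` the odd moments vanish and the even moments equal 1, as expected for a spectrum symmetric about 0.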

The edge behavior of $\sigma(K)$ corresponds to the extreme eigenvalues $\lambda_2(K)$ and $\lambda_n(K)$, or
more generally, to the $k$-extreme eigenvalues $\lambda_2(K), \ldots, \lambda_{k+1}(K)$ and $\lambda_n(K), \ldots, \lambda_{n-k+1}(K)$.
The geometric decay to the equilibrium measure $\rho$ of the continuous time random walk
with semigroup $(e^{t(K-I)})_{t \ge 0}$ generated by $K - I$ is governed by the so-called spectral gap

    $\mathrm{gap}(K - I) = 1 - \lambda_2(K)$.

In the aperiodic case, the relevant quantity for the discrete time random walk with kernel
$K$ is

    $\varsigma(K) = 1 - \max_{\lambda \in \sigma(K),\ \lambda \neq 1} |\lambda| = 1 - \max(-\lambda_n(K), \lambda_2(K))$.

In that case, for any fixed value of $n$, we have $(K^\ell)_{i,\cdot} \to \hat\rho$ as $\ell \to \infty$, for every $1 \le i \le n$,
where $\hat\rho$ is the normalization of $\rho$ to a probability measure.
We refer to e.g. [31], [25] for more details.

Complete graph model. Here we set $V = \{1, \ldots, n\}$ and $E = \{\{i,j\} ;\ i,j \in V\}$. Note
that we have a loop at any vertex. The weights $U_{i,j}$, $1 \le i \le j \le n$, are i.i.d. random
variables with common law $\mathcal{L}$ supported on $[0, \infty)$. The law $\mathcal{L}$ is independent of $n$. Without
loss of generality, we assume that the marks $U$ come from the truncation of a single infinite
triangular array $(U_{i,j})_{1 \le i \le j}$ of i.i.d. random variables of law $\mathcal{L}$. This defines a common
probability space, which is convenient for almost sure convergence as $n \to \infty$.
When $\mathcal{L}$ has finite mean $\int_0^\infty x \, \mathcal{L}(dx) = m$ we set $m = 1$. This is no loss of generality
since $K$ is invariant under the linear scaling $t \mapsto t \, U_{i,j}$. If $\mathcal{L}$ has a finite second moment
we write $\sigma^2 = \int_0^\infty (x - 1)^2 \, \mathcal{L}(dx)$ for the variance. The rows of $K$ are equally distributed
(but not independent) and follow an exchangeable law on $\mathbb{R}^n$. Since each row sums up to
one, we get by exchangeability that for every $1 \le i \le n$ and $1 \le j \neq j' \le n$,

    $E(K_{i,j}) = \frac{1}{n}$  and  $\mathrm{Cov}(K_{i,j}, K_{i,j'}) = -\frac{1}{n-1} \mathrm{Var}(K_{1,1})$.
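The mean and covariance identities for the entries of a row follow from a short computation, sketched here for convenience. The only inputs are that each row of $K$ sums to one and that the entries of a fixed row are exchangeable; the entries are bounded, so all variances are finite.

```latex
% Row i of K sums to one, so the variance of the row sum vanishes.
\begin{align*}
0 = \mathrm{Var}\Big(\sum_{j=1}^{n} K_{i,j}\Big)
  &= \sum_{j=1}^{n} \mathrm{Var}(K_{i,j})
   + \sum_{j \neq j'} \mathrm{Cov}(K_{i,j}, K_{i,j'}) \\
  &= n \, \mathrm{Var}(K_{1,1}) + n(n-1) \, \mathrm{Cov}(K_{1,1}, K_{1,2}),
\end{align*}
% hence Cov(K_{1,1}, K_{1,2}) = -Var(K_{1,1}) / (n-1).
% In the same way, 1 = E(\sum_j K_{i,j}) = n E(K_{1,1}) gives E(K_{i,j}) = 1/n.
```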

Note that $\mathcal{L}$ may have an atom at 0, i.e. $P(U_{i,j} = 0) = 1 - p$, for some $p \in (0, 1)$. In
this case $K$ describes a random walk on a weighted version of the standard Erdős-Rényi
$G(n,p)$ random graph. Since $p$ is fixed, almost surely (for $n$ large enough) there is no
isolated vertex, the row-sums $\rho_i$ are all positive, and $K$ is irreducible.

The following theorem states that if $\mathcal{L}$ has finite positive variance $0 < \sigma^2 < \infty$, then
the bulk of the spectrum of $\sqrt{n} K$ behaves as if we had a Wigner matrix with i.i.d. entries,
i.e. as if $\rho_i \equiv n$. We refer to e.g. [3], [1] for more on Wigner matrices and the semi-circle
law. The ESD of $\sqrt{n} K$ is $\mu_{\sqrt{n} K} = \frac{1}{n} \sum_{k=1}^{n} \delta_{\sqrt{n} \lambda_k(K)}$.

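Theorem 1.1 (Bulk behavior). If $\mathcal{L}$ has finite positive variance $0 < \sigma^2 < \infty$ then

    $\mu_{\sqrt{n} K} \xrightarrow[n \to \infty]{w} \mathcal{W}_{2\sigma}$

almost surely, where $\xrightarrow{w}$ stands for weak convergence of probability measures and $\mathcal{W}_{2\sigma}$ is
Wigner's semi-circle law with Lebesgue density

(3)    $x \mapsto \frac{1}{2\pi\sigma^2} \sqrt{4\sigma^2 - x^2} \; 1_{[-2\sigma, +2\sigma]}(x)$.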

The proof of Theorem 1.1, given in Section 2, relies on a uniform strong law of large
numbers which allows one to estimate $\rho_i = n(1 + o(1))$ and therefore yields a comparison of
$\sqrt{n} K$ with a suitable Wigner matrix with i.i.d. entries. Note that, even though

(4)    $\lambda_1(\sqrt{n} K) = \sqrt{n} \to \infty$ as $n \to \infty$,

the weak limit of $\mu_{\sqrt{n} K}$ is not affected since $\lambda_1(\sqrt{n} K)$ has weight $1/n$ in $\mu_{\sqrt{n} K}$. Theorem
1.1 implies that the bulk of $\sigma(K)$ collapses weakly at speed $n^{-1/2}$. Concerning the extremal
eigenvalues $\lambda_n(\sqrt{n} K)$ and $\lambda_2(\sqrt{n} K)$, we only get from Theorem 1.1 that almost surely,
for every fixed $k \in \mathbb{Z}_+$,

    $\liminf_{n \to \infty} \sqrt{n} \, \lambda_{n-k}(K) \le -2\sigma$  and  $\limsup_{n \to \infty} \sqrt{n} \, \lambda_{k+2}(K) \ge +2\sigma$.

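The result below gives the behavior of the extremal eigenvalues under the assumption that
$\mathcal{L}$ has finite fourth moment (i.e. $E(U_{1,1}^4) < \infty$).

Theorem 1.2 (Edge behavior). If $\mathcal{L}$ has finite positive variance $0 < \sigma^2 < \infty$ and finite
fourth moment then almost surely, for any fixed $k \in \mathbb{Z}_+$,

    $\lim_{n \to \infty} \sqrt{n} \, \lambda_{n-k}(K) = -2\sigma$  and  $\lim_{n \to \infty} \sqrt{n} \, \lambda_{k+2}(K) = +2\sigma$.

In particular, almost surely,

(5)    $\mathrm{gap}(K - I) = 1 - \frac{2\sigma}{\sqrt{n}} + o\left(\frac{1}{\sqrt{n}}\right)$  and  $\varsigma(K) = 1 - \frac{2\sigma}{\sqrt{n}} + o\left(\frac{1}{\sqrt{n}}\right)$.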



The proof of Theorem 1.2, given in Section 2, relies on a suitable rank one reduction
which allows us to compare $\lambda_2(\sqrt{n} K)$ with the largest eigenvalue of a Wigner matrix with
centered entries. This approach also requires a refined version of the uniform law of large
numbers used in the proof of Theorem 1.1.

The edge behavior of Theorem 1.2 allows one to reinforce Theorem 1.1 by providing
convergence of moments. Recall that for any integer $p \ge 1$, the weak convergence together
with the convergence of moments up to order $p$ is equivalent to the convergence in
Wasserstein $W_p$ distance, see e.g. [36]. For every real $p \ge 1$, the Wasserstein distance $W_p(\mu, \nu)$
between two probability measures $\mu, \nu$ on $\mathbb{R}$ is defined by

(6)    $W_p(\mu, \nu) = \inf_{\Pi} \left( \int_{\mathbb{R} \times \mathbb{R}} |x - y|^p \, \Pi(dx, dy) \right)^{1/p}$

where the infimum runs over the convex set of probability measures on $\mathbb{R}^2 = \mathbb{R} \times \mathbb{R}$ with
marginals $\mu$ and $\nu$. Let $\tilde\mu_{\sqrt{n} K}$ be the trimmed ESD defined by

    $\tilde\mu_{\sqrt{n} K} = \frac{1}{n-1} \sum_{k=2}^{n} \delta_{\sqrt{n} \lambda_k(K)} = \frac{n}{n-1} \mu_{\sqrt{n} K} - \frac{1}{n-1} \delta_{\sqrt{n}}$.

We then have the following corollary of Theorems 1.1 and 1.2, proved in Section 2.

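Corollary 1.3 (Strong convergence). If $\mathcal{L}$ has positive variance and finite fourth moment
then almost surely, for every $p \ge 1$,

    $\lim_{n \to \infty} W_p(\tilde\mu_{\sqrt{n} K}, \mathcal{W}_{2\sigma}) = 0$  while  $\lim_{n \to \infty} W_p(\mu_{\sqrt{n} K}, \mathcal{W}_{2\sigma}) = \begin{cases} 0 & \text{if } p < 2 \\ 1 & \text{if } p = 2 \\ \infty & \text{if } p > 2. \end{cases}$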



Recall that for every $k \in \mathbb{Z}_+$, the $k$-th moment of the semi-circle law $\mathcal{W}_{2\sigma}$ is zero if $k$
is odd and is $\sigma^k$ times the $(k/2)$-th Catalan number if $k$ is even. The $r$-th Catalan number
$\frac{1}{r+1} \binom{2r}{r}$ counts, among other things, the number of non-negative simple paths of length
$2r$ that start and end at 0.
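The combinatorial description of the Catalan numbers, and hence of the moments of $\mathcal{W}_{2\sigma}$, can be checked with a short dynamic program. This is a minimal sketch; the function names are illustrative and not taken from the paper.

```python
from math import comb


def catalan(r):
    """r-th Catalan number, binom(2r, r) / (r + 1)."""
    return comb(2 * r, r) // (r + 1)


def count_nonnegative_loops(r):
    """Count simple (+1/-1 step) paths of length 2r from 0 to 0 that stay >= 0."""
    counts = {0: 1}  # counts[h] = number of admissible paths currently at height h
    for _ in range(2 * r):
        nxt = {}
        for height, c in counts.items():
            for step in (1, -1):
                new_height = height + step
                if new_height >= 0:
                    nxt[new_height] = nxt.get(new_height, 0) + c
        counts = nxt
    return counts.get(0, 0)


def semicircle_moment(k, sigma):
    """k-th moment of the semi-circle law on [-2*sigma, +2*sigma]."""
    if k % 2 == 1:
        return 0.0
    return sigma ** k * catalan(k // 2)
```

In particular the second moment is $\sigma^2$, in agreement with the variance of the entries of the comparison Wigner matrix.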

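On the other hand, from (2), we know that for every $k \in \mathbb{Z}_+$, the $k$-th moment of the
ESD $\mu_{\sqrt{n} K}$ writes

    $\int_{\mathbb{R}} x^k \, \mu_{\sqrt{n} K}(dx) = \frac{1}{n} \mathrm{Tr}\big((\sqrt{n} K)^k\big) = n^{-1 + \frac{k}{2}} \sum_{i=1}^{n} r_k^U(i)$.

Additionally, from (5) we get

    $\int_{\mathbb{R}} x^k \, \mu_{\sqrt{n} K}(dx) = n^{-1 + \frac{k}{2}} + \left(1 - \frac{1}{n}\right) \int_{\mathbb{R}} x^k \, \tilde\mu_{\sqrt{n} K}(dx)$

where $\tilde\mu_{\sqrt{n} K}$ is the trimmed ESD defined earlier. We can then state the following.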

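Corollary 1.4 (Return probabilities). Let $r_k^U(i)$ be the probability that the random walk
on $V$ with kernel $K$ started at $i$ returns to $i$ after $k$ steps. If $\mathcal{L}$ has variance $0 < \sigma^2 < \infty$
and finite fourth moment then almost surely, for every $k \in \mathbb{Z}_+$,

(7)    $\lim_{n \to \infty} n^{-1 + \frac{k}{2}} \left( \sum_{i=1}^{n} r_k^U(i) - 1 \right) = \begin{cases} 0 & \text{if } k \text{ is odd} \\ \frac{\sigma^k}{k/2 + 1} \binom{k}{k/2} & \text{if } k \text{ is even.} \end{cases}$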



We end our analysis of the complete graph model with the behavior of the invariant
probability distribution $\hat\rho$ of $K$, obtained by normalizing the invariant vector $\rho$ as

    $\hat\rho = (\rho_1 + \cdots + \rho_n)^{-1} (\rho_1 \delta_1 + \cdots + \rho_n \delta_n)$.

Let $\mathcal{U} = n^{-1}(\delta_1 + \cdots + \delta_n)$ denote the uniform law on $\{1, \ldots, n\}$. As usual, the total
variation distance $\|\mu - \nu\|_{TV}$ between two probability measures $\mu = \sum_{k=1}^{n} \mu_k \delta_k$ and
$\nu = \sum_{k=1}^{n} \nu_k \delta_k$ on $\{1, \ldots, n\}$ is given by

    $\|\mu - \nu\|_{TV} = \frac{1}{2} \sum_{k=1}^{n} |\mu_k - \nu_k|$.
Proposition 1.5 (Invariant probability measure). If $\mathcal{L}$ has finite second moment, then
a.s.

(8)    $\lim_{n \to \infty} \|\hat\rho - \mathcal{U}\|_{TV} = 0$.



The proof of Proposition 1.5, given in Section 2, relies as before on a uniform law of
large numbers. The speed of convergence and fluctuation of $\|\hat\rho - \mathcal{U}\|_{TV}$ depends on the tail
of $\mathcal{L}$. The reader can find in Lemma 2.3 of Section 2 some estimates in this direction.
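The quantity in Proposition 1.5 is easy to simulate. The following sketch draws symmetric weights on the complete graph with loops and returns the total variation distance between the normalized invariant measure and the uniform law. The exponential law with mean 1 is an arbitrary illustrative choice, and the function names are hypothetical.

```python
import random


def invariant_probability(u):
    """Normalize the row sums rho_i of the weight array into a probability vector."""
    rho = [sum(row) for row in u]
    total = sum(rho)
    return [r / total for r in rho]


def total_variation(mu, nu):
    """Total variation distance between two probability vectors of equal length."""
    return 0.5 * sum(abs(a - b) for a, b in zip(mu, nu))


def tv_to_uniform(n, rng):
    """Sample weights on the complete graph with loops and compare rho-hat to uniform."""
    u = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            u[i][j] = u[j][i] = rng.expovariate(1.0)  # assumed law: Exp(1)
    return total_variation(invariant_probability(u), [1.0 / n] * n)
```

Calling `tv_to_uniform(n, random.Random(0))` for increasing `n` gives an empirical feel for the convergence in (8); with constant weights the distance is exactly zero.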



Chain graph model (birth-and-death). The complete graph model discussed earlier
provides a random reversible Markov kernel which is irreducible and aperiodic. One of the
key features of this model lies in the fact that the degree of each vertex is $n$, which goes to
infinity as $n \to \infty$. This property allows one to use a law of large numbers to control the
normalization $\rho_i$. The method will roughly still work if we replace the sequence of complete
graphs by a sequence of graphs for which the degrees are of order $n$. See e.g. [37] for a
survey of related results in the context of random graphs. To go beyond this framework, it
is natural to consider local models for which the degrees are uniformly bounded. We shall
focus on a simple birth-and-death Markov kernel $K = (K_{i,j})_{1 \le i,j \le n}$ on $\{1, \ldots, n\}$ given by

    $K_{i,i+1} = b_i$,  $K_{i,i} = a_i$,  $K_{i,i-1} = c_i$

where $(a_i)_{1 \le i \le n}$, $(b_i)_{1 \le i \le n}$, $(c_i)_{1 \le i \le n}$ are in $[0, 1]$ with $c_1 = b_n = 0$, $b_i + a_i + c_i = 1$ for
every $1 \le i \le n$, and $c_{i+1} > 0$ and $b_i > 0$ for every $1 \le i \le n - 1$. In other words, we have

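(9)    $K = \begin{pmatrix} a_1 & b_1 & & & & \\ c_2 & a_2 & b_2 & & & \\ & c_3 & a_3 & b_3 & & \\ & & \ddots & \ddots & \ddots & \\ & & & c_{n-1} & a_{n-1} & b_{n-1} \\ & & & & c_n & a_n \end{pmatrix}$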

The kernel $K$ is irreducible, reversible, and every vertex has degree $\le 3$. For an arbitrary
$\rho_1 > 0$, the measure $\rho = \rho_1 \delta_1 + \cdots + \rho_n \delta_n$ defined for every $2 \le i \le n$ by

    $\rho_i = \rho_1 \prod_{k=1}^{i-1} \frac{b_k}{c_{k+1}} = \rho_1 \frac{b_1 \cdots b_{i-1}}{c_2 \cdots c_i}$

is invariant and reversible for $K$, i.e. for $1 \le i,j \le n$, $\rho_i K_{i,j} = \rho_j K_{j,i}$. For every $1 \le i \le n$,
the $i$-th row $(c_i, a_i, b_i)$ of $K$ belongs to the 3-dimensional simplex

    $\Lambda_3 = \{v \in [0,1]^3 ;\ v_1 + v_2 + v_3 = 1\}$.

For every $v \in \Lambda_3$, we define the left and right "reflections" $v_- \in \Lambda_3$ and $v_+ \in \Lambda_3$ of $v$ by

    $v_- = (v_1 + v_3, v_2, 0)$  and  $v_+ = (0, v_2, v_1 + v_3)$.
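The product formula for the reversible measure of a birth-and-death kernel can be verified on a small example. In this sketch the numerical values of the rows and the helper names are arbitrary illustrative choices; indices are 0-based in the code, so `c[0]` and `b[-1]` play the roles of $c_1$ and $b_n$.

```python
def birth_death_kernel(a, b, c):
    """Tridiagonal kernel with K[i][i] = a[i], K[i][i+1] = b[i], K[i][i-1] = c[i]."""
    n = len(a)
    k = [[0.0] * n for _ in range(n)]
    for i in range(n):
        k[i][i] = a[i]
        if i + 1 < n:
            k[i][i + 1] = b[i]
        if i >= 1:
            k[i][i - 1] = c[i]
    return k


def reversible_measure(b, c, rho_first=1.0):
    """rho_i = rho_1 * (b_1 ... b_{i-1}) / (c_2 ... c_i), built recursively."""
    rho = [rho_first]
    for i in range(1, len(b)):
        rho.append(rho[-1] * b[i - 1] / c[i])
    return rho


# Rows (c_i, a_i, b_i) sum to one, with c_1 = b_n = 0.
c = [0.0, 0.3, 0.5, 0.6]
a = [0.2, 0.3, 0.1, 0.4]
b = [0.8, 0.4, 0.4, 0.0]
K = birth_death_kernel(a, b, c)
rho = reversible_measure(b, c)
```

The recursion is exactly the detailed balance relation $\rho_i b_i = \rho_{i+1} c_{i+1}$ between neighbouring sites, which is the only non-trivial constraint for a tridiagonal kernel.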

The following result provides a general answer for the behavior of the bulk. 

Theorem 1.6 (Global behavior for ergodic environment). Let $p : \mathbb{Z} \to \Lambda_3$ be an ergodic
random field. Let $K$ be the random birth-and-death kernel (9) on $\{1, \ldots, n\}$ obtained from
$p$ by taking for every $1 \le i \le n$

    $(c_i, a_i, b_i) = \begin{cases} p(i) & \text{if } 2 \le i \le n - 1 \\ p(1)_+ & \text{if } i = 1 \\ p(n)_- & \text{if } i = n. \end{cases}$

Then there exists a non-random probability measure $\mu$ on $[-1, +1]$ such that almost surely,

    $\lim_{n \to \infty} W_p(\mu_K, \mu) = 0$

for every $p \ge 1$, where $W_p$ is the Wasserstein distance (6). Moreover, for every $\ell \ge 0$,

    $\int_{-1}^{+1} x^\ell \, \mu(dx) = E[r_\ell^p(0)]$

where $r_\ell^p(0)$ is the probability of return to 0 in $\ell$ steps for the random walk on $\mathbb{Z}$ with
random environment $p$. The expectation is taken with respect to the environment $p$.

The proof of Theorem 1.6, given in Section 3, is a simple consequence of the ergodic
theorem; see also [§] for an earlier application to random conductance models. The reflec- 
tive boundary condition is not necessary for this result on the bulk of the spectrum, and 
essentially any boundary condition (e.g. Dirichlet or periodic) produces the same limiting 
law, with essentially the same proof. Moreover, this result is not limited to the one- 
dimensional random walks and it remains valid e.g. for any finite range reversible random 
walk with ergodic random environment on $\mathbb{Z}^d$. However, as we shall see below, a more
precise analysis is possible for certain types of environments when $d = 1$.

Consider the chain graph $G = (V, E)$ with $V = \{1, \ldots, n\}$ and $E = \{\{i,j\} ;\ |i - j| \le 1\}$.
A random conductance model on this graph can be obtained by defining $K$ with (1) by
putting i.i.d. positive weights $U$ of law $\mathcal{L}$ on the edges. For instance, if we remove the
loops, this corresponds to defining $K$ by (9) with $a_1 = \cdots = a_n = 0$, $b_1 = c_n = 1$, and, for
every $2 \le i \le n - 1$,

    $b_i = 1 - c_i = V_i = \frac{U_{i,i+1}}{U_{i,i+1} + U_{i,i-1}}$

where $(U_{i,i+1})_{i \ge 1}$ are i.i.d. random variables of law $\mathcal{L}$ supported in $(0, \infty)$. The random
variables $V_1, \ldots, V_n$ are dependent here.

Let us consider now an alternative simple way to make $K$ random. Namely, we use
a sequence $(V_i)_{i \ge 1}$ of i.i.d. random variables on $[0, 1]$ with common law $\mathcal{L}$ and define the
random birth-and-death Markov kernel $K$ by (9) with

    $b_1 = c_n = 1$  and  $b_i = 1 - c_i = V_i$ for every $2 \le i \le n - 1$.

In other words, the random Markov kernel $K$ is of the form

(10)    $K = \begin{pmatrix} 0 & 1 & & & & \\ 1 - V_2 & 0 & V_2 & & & \\ & 1 - V_3 & 0 & V_3 & & \\ & & \ddots & \ddots & \ddots & \\ & & & 1 - V_{n-1} & 0 & V_{n-1} \\ & & & & 1 & 0 \end{pmatrix}$

This is not a random conductance model. However, the kernel is a particular case of the
one appearing in Theorem 1.6, corresponding to the i.i.d. environment given by

    $p(i) = (1 - V_i, 0, V_i)$

for every $i \ge 1$. This gives the following corollary of Theorem 1.6.

Corollary 1.7 (Global behavior for i.i.d. environment). Let $K$ be the random birth-and-death
Markov kernel (10) where $(V_i)_{i \ge 2}$ are i.i.d. of law $\mathcal{L}$ on $[0,1]$. Then there exists a
non-random probability distribution $\mu$ on $[-1, +1]$ such that almost surely,

    $\lim_{n \to \infty} W_p(\mu_K, \mu) = 0$

for every $p \ge 1$, where $W_p$ is the Wasserstein distance as in (6). The limiting spectral
distribution $\mu$ is fully characterized by its sequence of moments, given for every $k \ge 1$ by

    $\int_{-1}^{+1} x^{2k-1} \, \mu(dx) = 0$  and  $\int_{-1}^{+1} x^{2k} \, \mu(dx) = \sum_{\gamma \in D_k} \prod_{i \in \mathbb{Z}} E\left( V^{N_\gamma(i)} (1 - V)^{N_\gamma(i-1)} \right)$
where $V$ is a random variable of law $\mathcal{L}$ and where

    $D_k = \{\gamma = (\gamma_0, \ldots, \gamma_{2k}) :\ \gamma_0 = \gamma_{2k} = 0, \text{ and } |\gamma_\ell - \gamma_{\ell+1}| = 1 \text{ for every } 0 \le \ell \le 2k - 1\}$

is the set of loop paths of length $2k$ of the simple random walk on $\mathbb{Z}$, and

    $N_\gamma(i) = \sum_{\ell=0}^{2k-1} 1_{\{(\gamma_\ell, \gamma_{\ell+1}) = (i, i+1)\}}$

is the number of times $\gamma$ crosses the horizontal line $y = i + \frac{1}{2}$ in the increasing direction.

When the random variables $(V_i)_{i \ge 2}$ are only stationary and ergodic, Corollary 1.7
remains valid provided that we adapt the formula for the even moments of $\mu$ (that is, move
the product inside the expectation).

Remark 1.8 (From Dirac masses to arc-sine laws). Corollary 1.7 gives a formula for the
moments of $\mu$. This formula is a series involving the "Beta-moments" of $\mathcal{L}$. We cannot
compute it explicitly for arbitrary laws $\mathcal{L}$ on $[0,1]$. However, in the deterministic case
$\mathcal{L} = \delta_{1/2}$, we have, for every integer $k \ge 1$,

    $\int_{-1}^{+1} x^{2k} \, \mu(dx) = \sum_{\gamma \in D_k} 2^{-\sum_i (N_\gamma(i) + N_\gamma(i-1))} = 2^{-2k} \binom{2k}{k} = \int_{-1}^{+1} x^{2k} \frac{dx}{\pi \sqrt{1 - x^2}}$

which confirms the known fact that $\mu$ is the arc-sine law on $[-1, +1]$ in this case (see e.g.
[20, III.4 page 80]). More generally, a very similar computation reveals that if $\mathcal{L} = \delta_p$
with $0 < p < 1$ then $\mu$ is the arc-sine law on $[-2\sqrt{p(1-p)}, +2\sqrt{p(1-p)}]$. Figures 1, 2, 3
display simulations illustrating Corollary 1.7 for various other choices of $\mathcal{L}$.
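The moment formula of Corollary 1.7 can be evaluated by brute force for small $k$, which gives a direct check of the arc-sine computation in Remark 1.8. The sketch below enumerates the loop paths in $D_k$ and multiplies the "Beta-moments" site by site. The function names are illustrative, and the law $\mathcal{L}$ enters only through a user-supplied function `beta_moment(a, b)` standing for $E(V^a (1-V)^b)$.

```python
from itertools import product
from math import comb, factorial


def loop_paths(k):
    """Yield all paths of length 2k on Z with unit steps, starting and ending at 0."""
    for steps in product((1, -1), repeat=2 * k):
        if sum(steps) == 0:
            path = [0]
            for s in steps:
                path.append(path[-1] + s)
            yield path


def up_crossings(path):
    """Map i -> N(i), the number of steps of the path from i to i + 1."""
    counts = {}
    for x, y in zip(path, path[1:]):
        if y == x + 1:
            counts[x] = counts.get(x, 0) + 1
    return counts


def even_moment(k, beta_moment):
    """Sum over loops of prod_i E(V^N(i) (1 - V)^N(i-1)); trivial factors are skipped."""
    total = 0.0
    for path in loop_paths(k):
        n_up = up_crossings(path)
        sites = set(n_up) | {i + 1 for i in n_up}
        weight = 1.0
        for i in sites:
            weight *= beta_moment(n_up.get(i, 0), n_up.get(i - 1, 0))
        total += weight
    return total


def dirac_beta_moment(p):
    """Beta-moments of the Dirac mass at p."""
    return lambda a, b: p ** a * (1.0 - p) ** b


def uniform_beta_moment(a, b):
    """Beta-moments of the uniform law on [0, 1]: a! b! / (a + b + 1)!."""
    return factorial(a) * factorial(b) / factorial(a + b + 1)
```

For the Dirac mass at $p$ every loop has weight $(p(1-p))^k$, so the $2k$-th moment is $\binom{2k}{k} (p(1-p))^k$, which is the $2k$-th moment of the arc-sine law on $[-2\sqrt{p(1-p)}, +2\sqrt{p(1-p)}]$. The enumeration has $2^{2k}$ terms and is only meant for small $k$.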

Remark 1.9 (Non-universality). The law $\mu$ in Corollary 1.7 is not universal, in the sense
that it depends on many "Beta-moments" of $\mathcal{L}$, in contrast with the complete graph case
where the limiting spectral distribution depends on $\mathcal{L}$ only via its first two moments.

We now turn to the edge behavior of $\sigma(K)$ where $K$ is as in (10). Since $K$ has period
2, one has $\lambda_n(K) = -1$ and we are interested in the behavior of $\lambda_2(K) = -\lambda_{n-1}(K)$ as
$n$ goes to infinity. Since the limiting spectral distribution $\mu$ is symmetric, the convex hull
of its support is of the form $[-\alpha_\mu, +\alpha_\mu]$ for some $0 \le \alpha_\mu \le 1$. The following result gives
information on $\alpha_\mu$. The reader may forge many conjectures in the same spirit for the map
$\mathcal{L} \mapsto \mu$ from the simulations given by Figures 1, 2, 3.

Theorem 1.10 (Edge behavior for i.i.d. environment). Let $K$ be the random birth-and-death
Markov kernel (10) where $(V_i)_{i \ge 2}$ are i.i.d. of law $\mathcal{L}$ on $[0, 1]$. Let $\mu$ be the symmetric
limiting spectral distribution on $[-1, +1]$ which appears in Corollary 1.7. Let $[-\alpha_\mu, +\alpha_\mu]$
be the convex hull of the support of $\mu$. If $\mathcal{L}$ has a positive density at $1/2$ then $\alpha_\mu = 1$.
Consequently, almost surely,

    $\lambda_2(K) = -\lambda_{n-1}(K) = 1 + o(1)$.

On the other hand, if $\mathcal{L}$ is supported on $[0, t]$ with $0 < t < 1/2$ or on $[t, 1]$ with $1/2 < t < 1$
then almost surely $\limsup_{n \to \infty} \lambda_2(K) < 1$ and therefore $\alpha_\mu < 1$.

The proof of Theorem 1.10 is given in Section 3. The speed of convergence of $\lambda_2(K) - 1$
to 0 is highly dependent on the choice of the law $\mathcal{L}$. As an example, if

    $E\left[ \log \frac{V}{1 - V} \right] = 0$  and  $E\left[ \left( \log \frac{V}{1 - V} \right)^2 \right] > 0$

where $V$ has law $\mathcal{L}$, then $K$ is the so-called Sinai random walk on $\{1, \ldots, n\}$. In this case,
by a slight modification of the analysis of [14], one can prove that almost surely,

    $-\infty < \liminf_{n \to \infty} \frac{1}{\sqrt{n}} \log(1 - \lambda_2(K)) \le \limsup_{n \to \infty} \frac{1}{\sqrt{n}} \log(1 - \lambda_2(K)) < 0$.

Thus, the convergence to the edge here occurs exponentially fast in $\sqrt{n}$. On the other
hand, if for instance $\mathcal{L} = \delta_{1/2}$ (simple reflected random walk on $\{1, \ldots, n\}$) then it is
known that $1 - \lambda_2(K)$ decays as $n^{-2}$ only.

We conclude with a list of remarks and open problems. 

Fluctuations at the edge. An interesting problem concerns the fluctuations of $\lambda_2(\sqrt{n} K)$
around its limiting value $2\sigma$ in the complete graph model. Under suitable moment conditions
on $\mathcal{L}$, one may seek a deterministic sequence $(a_n)$, and a probability distribution
$\mathcal{D}$ on $\mathbb{R}$ such that

(11)    $a_n \left( \lambda_2(\sqrt{n} K) - 2\sigma \right) \xrightarrow{d} \mathcal{D}$

where "$\xrightarrow{d}$" stands for convergence in distribution. The same may be asked for the random
variable $\lambda_n(\sqrt{n} K) + 2\sigma$. Computer simulations suggest that $a_n \approx n^{2/3}$ and that $\mathcal{D}$ is close
to a Tracy-Widom distribution. The heuristics here is that $\lambda_2(\sqrt{n} K)$ behaves like the $\lambda_1$
of a centered Gaussian random symmetric matrix. The difficulty is that the entries of $K$
are not i.i.d., not centered, and of course not Gaussian.

Symmetric Markov generators. Rather than considering the random walk with
infinitesimal generator $K - I$ on the complete graph as we did, one may start with the
symmetric infinitesimal generator $G$ defined by $G_{i,j} = G_{j,i} = U_{i,j}$ for every $1 \le i < j \le n$
and $G_{i,i} = -\sum_{j \neq i} G_{i,j}$ for every $1 \le i \le n$. Here $(U_{i,j})_{1 \le i < j}$ is a triangular array of i.i.d.
real random variables of law $\mathcal{L}$. For this model, the uniform probability measure $\mathcal{U}$ is
reversible and invariant. The bulk behavior of such random matrices has been investigated
in [IE]. 

Non-reversible Markov ensembles. A non-reversible model is obtained when the
underlying complete graph is oriented. That is, each vertex $i$ now has (besides the loop)
$n - 1$ outgoing edges $(i, j)$ and $n - 1$ incoming edges $(j, i)$. On each of these edges we place
an independent positive weight $V_{i,j}$ with law $\mathcal{L}$, and on each loop an independent positive
weight $V_{i,i}$ with law $\mathcal{L}$. This gives us a non-reversible stochastic matrix

    $K_{i,j} = \frac{V_{i,j}}{\sum_{k=1}^{n} V_{i,k}}$.

The spectrum of $K$ is now complex. If $\mathcal{L}$ is exponential, then the matrix $K$ describes the
Dirichlet Markov Ensemble considered in [17]. Numerical simulations suggest that if $\mathcal{L}$
has, say, finite positive variance, then the ESD of $n^{1/2} K$ converges weakly as $n \to \infty$ to the
uniform law on the unit disc of the complex plane (circular law). At the time of writing,
this conjecture is still open. Note that the ESD of the i.i.d. matrix $(n^{-1/2} V_{i,j})_{1 \le i,j \le n}$ is
known to converge weakly to the circular law; see [35] and references therein.

Heavy-tailed weights. Recently, remarkable work has been devoted to the spectral
analysis of large dimensional symmetric random matrices with heavy-tailed i.i.d. entries, 
see e.g. [331 El HI [381 E]- Similarly, on the complete graph, one may consider the bulk 
and edge behavior of the random reversible Markov kernels constructed by (1) when the
law $\mathcal{L}$ of the weights is heavy-tailed (i.e. with at least an infinite second moment). In
that case, and in contrast with Theorem 1.1, the scaling is not $\sqrt{n}$ and the limiting
spectral distribution is not Wigner's semi-circle law. We study such heavy-tailed models
elsewhere [12]. Another interesting model is the so-called trap model, which corresponds
to putting heavy-tailed weights only on the diagonal of $U$ (holding times), see e.g. [13] for
some recent advances.
2. Proofs for the complete graph model

Here we prove Theorems 1.1 and 1.2, Proposition 1.5, and Corollary 1.3. In the whole sequel,
we denote by $L^2(1)$ the Hilbert space $\mathbb{R}^n$ equipped with the scalar product

    $\langle x, y \rangle = \sum_{i=1}^{n} x_i y_i$.



The following simple lemma allows us to work with symmetric matrices when needed. 

Lemma 2.1 (Spectral equivalence). Almost surely, for large enough $n$, the spectrum of
the reversible Markov matrix $K$ coincides with the spectrum of the symmetric matrix $S$
defined by

    $S_{i,j} = \sqrt{\frac{\rho_i}{\rho_j}} \, K_{i,j} = \frac{U_{i,j}}{\sqrt{\rho_i \rho_j}}$.

Moreover, the dimensions of the corresponding eigenspaces also coincide.

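Proof. Almost surely, for large enough $n$, all the $\rho_i$ are positive and $K$ is self-adjoint as an
operator from $L^2(\rho)$ to $L^2(\rho)$, where $L^2(\rho)$ denotes $\mathbb{R}^n$ equipped with the scalar product

    $\langle x, y \rangle_\rho = \sum_{i=1}^{n} \rho_i x_i y_i$.

It suffices to observe that a.s. for large enough $n$, the map $x \mapsto \hat{x}$ defined by

    $\hat{x} = (x_1 \sqrt{\rho_1}, \ldots, x_n \sqrt{\rho_n})$

is an isometry from $L^2(\rho)$ to $L^2(1)$ and that for any $x, y \in \mathbb{R}^n$ and $1 \le i \le n$, we have

    $(Kx)_i = \sum_{j=1}^{n} K_{i,j} x_j$

and

    $\langle Kx, y \rangle_\rho = \sum_{i,j=1}^{n} K_{i,j} x_j y_i \rho_i = \sum_{i,j=1}^{n} U_{i,j} x_j y_i = \sum_{i,j=1}^{n} S_{i,j} \hat{x}_j \hat{y}_i = \langle S \hat{x}, \hat{y} \rangle$.

□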
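The conjugation used in Lemma 2.1 can be checked numerically on a small sample. In this sketch the weights are drawn uniformly from $[0.1, 1]$ (an arbitrary positive law chosen for illustration) and the variable names are hypothetical; the code verifies that $S$ is symmetric with entries in $[0,1]$ and that $\langle Kx, y \rangle_\rho = \langle S \hat{x}, \hat{y} \rangle$.

```python
import math
import random

rng = random.Random(1)
n = 5

# Symmetric positive weights on the complete graph with loops.
U = [[0.0] * n for _ in range(n)]
for i in range(n):
    for j in range(i, n):
        U[i][j] = U[j][i] = rng.uniform(0.1, 1.0)

rho = [sum(row) for row in U]
K = [[U[i][j] / rho[i] for j in range(n)] for i in range(n)]
S = [[U[i][j] / math.sqrt(rho[i] * rho[j]) for j in range(n)] for i in range(n)]

# Two test vectors and their images under x -> (x_i * sqrt(rho_i))_i.
x = [rng.gauss(0.0, 1.0) for _ in range(n)]
y = [rng.gauss(0.0, 1.0) for _ in range(n)]
x_hat = [x[i] * math.sqrt(rho[i]) for i in range(n)]
y_hat = [y[i] * math.sqrt(rho[i]) for i in range(n)]

# <Kx, y>_rho and <S x_hat, y_hat>.
lhs = sum(rho[i] * sum(K[i][j] * x[j] for j in range(n)) * y[i] for i in range(n))
rhs = sum(sum(S[i][j] * x_hat[j] for j in range(n)) * y_hat[i] for i in range(n))
```

Since $S = D^{1/2} K D^{-1/2}$ with $D = \mathrm{diag}(\rho_1, \ldots, \rho_n)$, the two matrices are similar, which is another way to see that their spectra coincide.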

The random symmetric matrix $S$ has non-centered, non-independent entries. Each
entry of $S$ is bounded and belongs to the interval $[0, 1]$, since for every $1 \le i,j \le n$, we
have $S_{i,j} \le U_{i,j} / \sqrt{U_{i,j} U_{j,i}} = 1$. In the sequel, for any $n \times n$ real symmetric matrix $A$, we
denote by

    $\lambda_n(A) \le \cdots \le \lambda_1(A)$

its ordered spectrum. We shall also denote by $\|A\|$ the operator norm of $A$, defined by

    $\|A\|^2 = \max_{x \in \mathbb{R}^n,\ x \neq 0} \frac{\langle Ax, Ax \rangle}{\langle x, x \rangle}$.

Clearly, $\|A\| = \max(\lambda_1(A), -\lambda_n(A))$. To prove Theorem 1.1 we shall compare the
symmetric random matrix $\sqrt{n} S$ with the symmetric $n \times n$ random matrices

(12)    $W_{i,j} = \frac{U_{i,j} - 1}{\sqrt{n}}$  and  $\widetilde{W}_{i,j} = \frac{U_{i,j}}{\sqrt{n}}$.

Note that $W$ defines a so-called Wigner matrix, i.e. $W$ is symmetric and it has centered
i.i.d. entries with finite positive variance. We shall also need the non-centered matrix $\widetilde{W}$.
It is well known that under the sole assumption $\sigma^2 \in (0, \infty)$ on $\mathcal{L}$, almost surely,

    $\mu_W \xrightarrow[n \to \infty]{w} \mathcal{W}_{2\sigma}$  and  $\mu_{\widetilde{W}} \xrightarrow[n \to \infty]{w} \mathcal{W}_{2\sigma}$

where $\mu_W$ and $\mu_{\widetilde{W}}$ are the ESD of $W$ and $\widetilde{W}$, see e.g. [5, Theorems 2.1 and 2.12]. Note
that $\widetilde{W}$ is a rank one perturbation of $W$, which implies that the spectra of $W$ and $\widetilde{W}$ are
interlaced (Weyl-Poincare inequalities, see e.g. [MIE]). Moreover, under the assumption 
of finite fourth moment on $\mathcal{L}$, it is known that almost surely

    $\lambda_n(W) \to -2\sigma$  and  $\lambda_1(W) \to +2\sigma$.

In particular, almost surely,

(13)    $\|W\| = 2\sigma + o(1)$.

On the other hand, and still under the finite fourth moment assumption, almost surely,

    $\lambda_1(\widetilde{W}) \to +\infty$  while  $\lambda_2(\widetilde{W}) \to +2\sigma$  and  $\lambda_n(\widetilde{W}) \to -2\sigma$

see e.g. [4"1 121] 15]. Heuristically, when n is large, the law of large numbers implies that pi 
is close to $n$ (recall that here $\mathcal{L}$ has mean 1), and thus $\sqrt{n} S$ is close to the non-centered matrix $\widetilde{W}$. The main tools
needed for a comparison of the matrix $\sqrt{n} S$ with $\widetilde{W}$ are given in the following subsection.

Uniform law of large numbers. We shall need the following Kolmogorov-Marcinkiewicz- 
Zygmund strong uniform law of large numbers, related to Baum-Katz type theorems. 

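Lemma 2.2. Let $(A_{i,j})_{i,j \ge 1}$ be a symmetric array of i.i.d. random variables. For any reals
$a > 1/2$, $b \ge 0$, and $M > 0$, if $E(|A_{1,1}|^{(1+b)/a}) < \infty$ then

    $\max_{1 \le i \le M n^b} \left| \sum_{j=1}^{n} (A_{i,j} - c) \right| = o(n^a)$ a.s.  where  $c = \begin{cases} E(A_{1,1}) & \text{if } a \le 1 \\ \text{any number} & \text{if } a > 1. \end{cases}$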



Proof. This result is proved in O Lemma 2] for a non-symmetric array. The symmetry 
makes the random variables (Sj=i -^i,i)»>i dependent, but a careful analysis of the argu- 
ment shows that this is not a problem except for a sort of converse, see Lemma 2] for 
details. □ 

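Lemma 2.3. If $\mathcal{L}$ has finite moment of order $\kappa \in [1, 2]$ then

(14)    $\max_{1 \le i \le n^{\kappa - 1}} \left| \frac{\rho_i}{n} - 1 \right| = o(1)$

almost surely, and in particular, if $\mathcal{L}$ has finite second moment, then almost surely

(15)    $\max_{1 \le i \le n} \left| \frac{\rho_i}{n} - 1 \right| = o(1)$.

Moreover if $\mathcal{L}$ has finite moment of order $\kappa$ with $2 \le \kappa < 4$, then almost surely

(16)    $\max_{1 \le i \le n} \left| \frac{\rho_i}{n} - 1 \right| = o\left(n^{\frac{2}{\kappa} - 1}\right)$.

Additionally, if $\mathcal{L}$ has finite fourth moment, then almost surely

(17)    $\sum_{i=1}^{n} \left( \frac{\rho_i}{n} - 1 \right)^2 = O(1)$.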

i=i 

Proof. The result (14) follows from Lemma 2.2 with

    $A_{i,j} = U_{i,j}$,  $a = M = 1$,  $b = \kappa - 1$.

We recover the standard strong law of large numbers with $\kappa = 1$. The result (16), and
therefore (15) by setting $\kappa = 2$, follows from Lemma 2.2 with this time

    $A_{i,j} = U_{i,j}$,  $a = 2/\kappa$,  $b = M = 1$.

Proof of (17). We set $\varepsilon_i = n^{-1} \rho_i - 1$ for every $1 \le i \le n$. Since $\mathcal{L}$ has finite fourth
moment, the result (13) for the centered Wigner matrix $W$ defined by (12) gives, with $1$
denoting also the vector $(1, \ldots, 1)$ of $\mathbb{R}^n$, that

    $\sum_{i=1}^{n} \varepsilon_i^2 = \frac{\langle W1, W1 \rangle}{\langle 1, 1 \rangle} \le \|W\|^2 = 4\sigma^2 + o(1) = O(1)$

almost surely. □
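The key identity in the proof of (17) is purely algebraic: the $i$-th coordinate of $W1$ is $\sum_j (U_{i,j} - 1)/\sqrt{n} = \sqrt{n} \, \varepsilon_i$, so that $\sum_i \varepsilon_i^2 = \langle W1, W1 \rangle / n$. The following sketch checks this on a random sample; the exponential law of the weights and the variable names are illustrative assumptions.

```python
import math
import random

rng = random.Random(2)
n = 8

# Symmetric weights with mean 1 (assumed law: Exp(1)).
U = [[0.0] * n for _ in range(n)]
for i in range(n):
    for j in range(i, n):
        U[i][j] = U[j][i] = rng.expovariate(1.0)

# eps_i = rho_i / n - 1 and the centered matrix W_ij = (U_ij - 1) / sqrt(n).
eps = [sum(U[i]) / n - 1.0 for i in range(n)]
W = [[(U[i][j] - 1.0) / math.sqrt(n) for j in range(n)] for i in range(n)]

# W applied to the all-ones vector is the vector of row sums of W.
W_ones = [sum(row) for row in W]

lhs = sum(e * e for e in eps)
rhs = sum(w * w for w in W_ones) / n  # <W1, W1> / <1, 1>
```

The bound by $\|W\|^2$ then follows from the definition of the operator norm applied to the vector $1$.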
We are now able to give a proof of Proposition 1.5.

Proof of Proposition 1.5. Since $\mathcal{L}$ has finite first moment, by the strong law of large
numbers,

    $\rho_1 + \cdots + \rho_n = \sum_{i=1}^{n} U_{i,i} + 2 \sum_{1 \le i < j \le n} U_{i,j} = n^2 (1 + o(1))$

almost surely. For every fixed $i \ge 1$, we have also $\rho_i = n(1 + o(1))$ almost surely. As a
consequence, for every fixed $i \ge 1$, almost surely,

    $\hat\rho_i = \frac{\rho_i}{\rho_1 + \cdots + \rho_n} = \frac{n(1 + o(1))}{n^2 (1 + o(1))} = \frac{1}{n} (1 + o(1))$.

Moreover, since $\mathcal{L}$ has finite second moment, the $o(1)$ in the right hand side above is
uniform over $1 \le i \le n$ thanks to (15) of Lemma 2.3. This completes the proof. □

Note that, under the second moment assumption, $\hat\rho_i = n^{-1} (1 + O(\delta))$ for $1 \le i \le n$,
where

(18)    $\delta := \max_{1 \le i \le n} |\varepsilon_i| = o(1)$,  with  $\varepsilon_i := n^{-1} \rho_i - 1$.

We will repeatedly use the notation (18) in the sequel.

Bulk behavior. Lemma 2.1 reduces Theorem 1.1 to the study of the ESD of $\sqrt{n} S$, a
symmetric matrix with non-independent entries. One can find in the literature many
extensions of Wigner's theorem to symmetric matrices with non-i.i.d. entries. However,
none of these results seems to apply here directly.

Proof of Theorem 1.1. We first recall a standard fact about comparison of spectral
densities of symmetric matrices. Let $L(F, G)$ denote the Lévy distance between two cumulative
distribution functions $F$ and $G$ on $\mathbb{R}$, defined by

    $L(F, G) = \inf\{\varepsilon > 0 \text{ such that } F(\cdot - \varepsilon) - \varepsilon \le G \le F(\cdot + \varepsilon) + \varepsilon\}$.

It is well known [7] that the Lévy distance is a metric for weak convergence of probability
distributions on $\mathbb{R}$. If $F_A$ and $F_B$ are the cumulative distribution functions of the empirical
spectral distributions of two hermitian $n \times n$ matrices $A$ and $B$, we have the following
bound for the third power of $L(F_A, F_B)$ in terms of the trace of $(A - B)^2$:

(19)    $L^3(F_A, F_B) \le \frac{1}{n} \mathrm{Tr}((A - B)^2) = \frac{1}{n} \sum_{i,j=1}^{n} (A_{i,j} - B_{i,j})^2$.

The proof of this estimate is a consequence of the Hoffman-Wielandt inequality [23], see
also [3, Lemma 2.3]. By Lemma 2.1 we have $\sqrt{n} \lambda_k(K) = \lambda_k(\sqrt{n} S)$ for every $1 \le k \le n$.
We shall use the bound (19) for the matrices $A = \sqrt{n} S$ and $B = \widetilde{W}$, where $\widetilde{W}$ is defined
in (12). We will show that a.s.

(20)    $\frac{1}{n} \sum_{i,j=1}^{n} (A_{i,j} - B_{i,j})^2 = O(\delta^2)$

where $\delta = \max_i |\varepsilon_i|$ as in (18). Since $\mathcal{L}$ has finite positive variance, we know that the ESD
of $B$ tends weakly as $n \to \infty$ to the semi-circle law on $[-2\sigma, +2\sigma]$. Therefore the bound
(20), with (19) and the fact that $\delta \to 0$ as $n \to \infty$, is sufficient to prove the theorem. We
turn to a proof of (20). For every $1 \le i, j \le n$, we have

    $A_{i,j} - B_{i,j} = \frac{U_{i,j}}{\sqrt{n}} \left( \frac{n}{\sqrt{\rho_i \rho_j}} - 1 \right)$.

Set, as usual, $\rho_i = n(1 + \varepsilon_i)$ and define $\psi_i = (1 + \varepsilon_i)^{-1/2} - 1$. Note that by Lemma 2.3,
almost surely, $\psi_i = O(\delta)$ uniformly in $i = 1, \ldots, n$. Also,

    $\frac{n}{\sqrt{\rho_i \rho_j}} - 1 = (1 + \psi_i)(1 + \psi_j) - 1 = \psi_i + \psi_j + \psi_i \psi_j$.

In particular, $\frac{n}{\sqrt{\rho_i \rho_j}} - 1 = O(\delta)$. Therefore

    $\frac{1}{n} \sum_{i,j=1}^{n} (A_{i,j} - B_{i,j})^2 \le O(\delta^2) \, \frac{1}{n^2} \sum_{i,j=1}^{n} U_{i,j}^2$.

By the strong law of large numbers, $\frac{1}{n^2} \sum_{i,j=1}^{n} U_{i,j}^2 \to \sigma^2 + 1$ a.s., which implies (20). □
Edge behavior. We turn to the proof of Theorem 1 1 . 2 1 which concerns the edge of a(\/nS). 

Proof of Theorem ] 1. Si Thanks to Lemma 12. II and the global behavior proven in Theorem 
ll-H it is enough to show that, almost surely, 

limsup y/nmax(\\2(S)\, |A n (S)|) < 2a . 

n— >oo 

Since K is almost surely irreducible for large enough n, the eigenspace of S of the eigen- 
value 1 is almost surely of dimension 1, and is given by R(y^oT, . . . , -Jp~^)- Let P be the 
orthogonal projector on Ry/p. The matrix P is n x n symmetric of rank 1, and for every 
1 < i, j < n, 

p _ y/NPj 
r hi — sr^n ■ 

The spectrum of the symmetric matrix S — P is 

{A n (5),...,A 2 (5)}U{0}. 

By subtracting P from S we remove the largest eigenvalue 1 from the spectrum, without 
touching the remaining eigenvalues. Let V be the random set of vectors of unit Euclidean 
norm which are orthogonal to ^fp for the scalar product (•, •) of R n . We have then 

V / nmax(|A2(5')|, |A n (5)|) = max \l\fnSv,v)\ = max.\(Av,v)\ 
where A is the n x n random symmetric matrix defined by 

Au = Ms - Pk, = (-^L - ■ 

\y/PiPj Ek=lPkJ 

In Lemma 2.4 below we establish that almost surely $\langle v, (A - W)v\rangle = O(\delta) + O(n^{-1/2})$ uniformly in $v \in V$, where $W$ is defined in (12) and $\delta$ is given by (18). Thus, using (13),

\[ |\langle W v, v\rangle| \le \max(|\lambda_1(W)|, |\lambda_n(W)|) = 2\sigma + o(1), \]

we obtain that almost surely, uniformly in $v \in V$,

\[ |\langle A v, v\rangle| \le |\langle W v, v\rangle| + |\langle (A - W)v, v\rangle| = 2\sigma + o(1) + O(\delta). \]

Thanks to Lemma 2.3 we know that $\delta = o(1)$ and the theorem follows. □
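As a numerical illustration of the edge statement, one may compute $\sqrt{n}\max(|\lambda_2(S)|,|\lambda_n(S)|)$ as the spectral norm of $A = \sqrt{n}(S - P)$. This is a rough sketch, not part of the proof: the exponential weights (mean 1, $\sigma = 1$) and the function name `edge_statistic` are assumptions made for the example, and NumPy is used for the eigenvalues.

```python
import numpy as np

def edge_statistic(n, seed=0):
    """Return sqrt(n) * max(|lambda_2(S)|, |lambda_n(S)|) for one sample.

    Assumption: exponential(1) weights, so that 2 * sigma = 2.
    """
    rng = np.random.default_rng(seed)
    G = rng.exponential(1.0, size=(n, n))
    U = np.triu(G) + np.triu(G, 1).T            # symmetric weights
    rho = U.sum(axis=1)
    root = np.sqrt(np.outer(rho, rho))          # sqrt(rho_i rho_j)
    S = U / root
    P = root / rho.sum()                        # projector onto R sqrt(rho)
    A = np.sqrt(n) * (S - P)                    # spectrum: sqrt(n){lambda_n..lambda_2} and 0
    eig = np.linalg.eigvalsh(A)                 # ascending order
    return max(abs(eig[0]), abs(eig[-1]))

stat = edge_statistic(400)
print(stat)  # expected to be near 2 * sigma = 2 for large n
```

Subtracting $P$ before diagonalizing mirrors the argument above: the eigenvalue 1 of $S$ is replaced by 0, so the extreme eigenvalues of $A$ are exactly the rescaled $\lambda_2(S)$ and $\lambda_n(S)$.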

Lemma 2.4. Almost surely, uniformly in $v \in V$, we have, with $\delta := \max_i |\varepsilon_i|$,

\[ \langle v, (A - W)v\rangle = O(\delta) + O(n^{-1/2}). \]

Proof. We start by rewriting the matrix

\[ A_{ij} = \sqrt{n}\,\frac{U_{ij}}{\sqrt{\rho_i\rho_j}} - \sqrt{n}\,\frac{\sqrt{\rho_i\rho_j}}{\sum_k \rho_k} \]

by expanding around the law of large numbers. We set $\rho_i = n(1+\varepsilon_i)$ and we define

\[ \varphi_i = \sqrt{1+\varepsilon_i} - 1 \quad\text{and}\quad \psi_i = \frac{1}{\sqrt{1+\varepsilon_i}} - 1. \]

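Observe that $\varphi_i$ and $\psi_i$ are of order $\varepsilon_i$ and by Lemma 2.3, cf. (13), we have a.s.

\[ \langle\varphi,\varphi\rangle = \sum_i \varphi_i^2 = O(1) \quad\text{and}\quad \langle\psi,\psi\rangle = \sum_i \psi_i^2 = O(1). \tag{21} \]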

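We expand

\[ \sqrt{\rho_i\rho_j} = n(1+\varepsilon_i)^{1/2}(1+\varepsilon_j)^{1/2} = n(1+\varphi_i)(1+\varphi_j). \]

Similarly, we have

\[ \frac{1}{\sqrt{\rho_i\rho_j}} = \frac{1}{n}(1+\psi_i)(1+\psi_j). \]

Moreover, writing

\[ \sum_{k=1}^{n}\rho_k = n^2\Bigl(1 + \frac{1}{n}\sum_{k}\varepsilon_k\Bigr) \]

and setting $\gamma := \bigl(1 + \frac{1}{n}\sum_k \varepsilon_k\bigr)^{-1} - 1$ we see that

\[ \Bigl(\sum_{k=1}^{n}\rho_k\Bigr)^{-1} = \frac{1}{n^2}(1+\gamma). \]

Note that $\gamma = O(\delta)$. Using these expansions we obtain

\[ \sqrt{n}\,\frac{U_{ij}}{\sqrt{\rho_i\rho_j}} = \frac{U_{ij}}{\sqrt{n}}(1+\psi_i)(1+\psi_j) \]

and

\[ \sqrt{n}\,\frac{\sqrt{\rho_i\rho_j}}{\sum_k \rho_k} = \frac{1}{\sqrt{n}}(1+\varphi_i)(1+\varphi_j)(1+\gamma). \]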

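From these expressions, with the definitions

\[ \Phi_{ij} = \varphi_i + \varphi_j + \varphi_i\varphi_j \quad\text{and}\quad \Psi_{ij} = \psi_i + \psi_j + \psi_i\psi_j, \]

we obtain

\[ A_{ij} = W_{ij}(1+\Psi_{ij}) + \frac{1}{\sqrt{n}}\bigl[\Psi_{ij} - \Phi_{ij}(1+\gamma) - \gamma\bigr]. \]

Therefore, we have

\[ \langle v, (W - A)v\rangle = -\sum_{i,j} v_i W_{ij}\Psi_{ij} v_j + \frac{1+\gamma}{\sqrt{n}}\langle v, \Phi v\rangle - \frac{1}{\sqrt{n}}\langle v, \Psi v\rangle + \frac{\gamma}{\sqrt{n}}\langle v, 1\rangle^2. \]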

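Let us first show that

\[ \langle v, 1\rangle = O(1). \tag{22} \]

Indeed, $v \in V$ implies that for any $c \in \mathbb{R}$,

\[ \langle v, 1\rangle = \langle v, 1 - c\sqrt{\rho}\rangle. \]

Taking $c = 1/\sqrt{n}$ we see that

\[ 1 - (c\sqrt{\rho})_i = 1 - \sqrt{1+\varepsilon_i} = -\varphi_i. \]

Thus, Cauchy-Schwarz' inequality implies

\[ \langle v, 1\rangle^2 \le \langle v, v\rangle\langle\varphi,\varphi\rangle \]

and (22) follows from (21) above. Next, we show that

\[ \langle v, \Phi v\rangle = O(1). \tag{23} \]

Note that

\[ \langle v, \Phi v\rangle = 2\langle v, 1\rangle\langle v, \varphi\rangle + \langle v, \varphi\rangle^2. \]

Since $\langle v, \varphi\rangle^2 \le \langle v, v\rangle\langle\varphi,\varphi\rangle$ we see that (23) follows from (21) and (22). In the same way we obtain that $\langle v, \Psi v\rangle = O(1)$. So far we have obtained the estimate

\[ \langle v, (W - A)v\rangle = -\sum_{i,j} v_i W_{ij}\Psi_{ij} v_j + O(n^{-1/2}). \tag{24} \]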

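To bound the first term above we observe that

\[ \sum_{i,j} v_i W_{ij}\Psi_{ij} v_j = 2\sum_i \psi_i v_i (Wv)_i + \sum_{i,j} \psi_i v_i W_{ij}\psi_j v_j = 2\langle\hat\psi, W v\rangle + \langle\hat\psi, W\hat\psi\rangle, \]

where $\hat\psi$ denotes the vector $\hat\psi_i := \psi_i v_i$. Note that

\[ \langle\hat\psi,\hat\psi\rangle = \sum_i \psi_i^2 v_i^2 \le O(\delta^2)\langle v, v\rangle = O(\delta^2). \]

Therefore, by definition of the norm $\|W\|$,

\[ |\langle\hat\psi, W\hat\psi\rangle| \le \sqrt{\langle\hat\psi,\hat\psi\rangle}\sqrt{\langle W\hat\psi, W\hat\psi\rangle} \le \|W\|\,\langle\hat\psi,\hat\psi\rangle = O(\delta^2)\,\|W\|. \]

Similarly, we have

\[ |\langle\hat\psi, W v\rangle| \le \sqrt{\langle\hat\psi,\hat\psi\rangle}\sqrt{\langle W v, W v\rangle} \le O(\delta)\,\|W\|\sqrt{\langle v, v\rangle} = O(\delta)\,\|W\|. \]

From (13), $\|W\| = 2\sigma + o(1) = O(1)$. Therefore, going back to (24) we have obtained

\[ \langle v, (W - A)v\rangle = O(\delta) + O(n^{-1/2}). \]

□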
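The algebraic core of this proof is the exact identity $A_{ij} = W_{ij}(1+\Psi_{ij}) + n^{-1/2}[\Psi_{ij} - \Phi_{ij}(1+\gamma) - \gamma]$. The following sketch checks it entrywise on a small random sample. It assumes that $W$ in (12) is the centered matrix $W_{ij} = (U_{ij} - 1)/\sqrt{n}$ with mean-one weights, which is not restated in this section, and it uses exponential weights purely for illustration. The name `decomposition_error` is hypothetical.

```python
import math
import random

def decomposition_error(n=8, seed=1):
    """Largest entrywise gap between A_ij and its expansion around W_ij.

    Assumptions: mean-one (exponential) weights and W_ij = (U_ij - 1) / sqrt(n).
    """
    rng = random.Random(seed)
    U = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            U[i][j] = U[j][i] = rng.expovariate(1.0)
    rho = [sum(row) for row in U]
    total = sum(rho)
    eps = [r / n - 1.0 for r in rho]                      # rho_i = n (1 + eps_i)
    phi = [math.sqrt(1.0 + e) - 1.0 for e in eps]
    psi = [1.0 / math.sqrt(1.0 + e) - 1.0 for e in eps]
    gamma = 1.0 / (1.0 + sum(eps) / n) - 1.0
    worst = 0.0
    for i in range(n):
        for j in range(n):
            root = math.sqrt(rho[i] * rho[j])
            a_ij = math.sqrt(n) * (U[i][j] / root - root / total)
            w_ij = (U[i][j] - 1.0) / math.sqrt(n)
            Phi = phi[i] + phi[j] + phi[i] * phi[j]
            Psi = psi[i] + psi[j] + psi[i] * psi[j]
            rhs = w_ij * (1.0 + Psi) + (Psi - Phi * (1.0 + gamma) - gamma) / math.sqrt(n)
            worst = max(worst, abs(a_ij - rhs))
    return worst

print(decomposition_error())  # should be at the level of rounding error
```

The identity is exact for every $n$, so the check does not rely on any asymptotics; only the subsequent bounds on $\Phi$, $\Psi$ and $\gamma$ use the law of large numbers.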

We end this section with the proof of Corollary 1.3.

Proof of Corollary 1.3. By Theorem 1.2, almost surely, and for any compact subset $C$ of $\mathbb{R}$ containing strictly $[-2\sigma, 2\sigma]$, the law $\tilde\mu_{\sqrt{n}K}$ is supported in $C$ for large enough $n$. On the other hand, since $\mu_{\sqrt{n}K} = (1 - n^{-1})\tilde\mu_{\sqrt{n}K} + n^{-1}\delta_{\sqrt{n}}$, we get from Theorem 1.1 that almost surely, $\tilde\mu_{\sqrt{n}K}$ tends weakly to $\mathcal{W}_{2\sigma}$ as $n \to \infty$. Now, for sequences of probability measures supported in a common compact set, by Weierstrass' theorem, weak convergence is equivalent to Wasserstein convergence $W_p$ for every $p \ge 1$. Consequently, almost surely,

\[ \lim_{n\to\infty} W_p(\tilde\mu_{\sqrt{n}K}, \mathcal{W}_{2\sigma}) = 0 \tag{25} \]

for every $p \ge 1$. It remains to study $W_p(\mu_{\sqrt{n}K}, \mathcal{W}_{2\sigma})$. Recall that if $\nu_1$ and $\nu_2$ are two probability measures on $\mathbb{R}$ with cumulative distribution functions $F_{\nu_1}$ and $F_{\nu_2}$ with respective generalized inverses $F^{-1}_{\nu_1}$ and $F^{-1}_{\nu_2}$, then, for every real $p \ge 1$, we have, according to e.g. [36, Remark 2.19 (ii)],

\[ W_p(\nu_1, \nu_2)^p = \int_0^1 \bigl|F^{-1}_{\nu_1}(t) - F^{-1}_{\nu_2}(t)\bigr|^p\,dt. \tag{26} \]

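Let us take $\nu_1 = \mu_{\sqrt{n}K} = (1 - n^{-1})\tilde\mu_{\sqrt{n}K} + n^{-1}\delta_{\sqrt{n}}$ and $\nu_2 = \mathcal{W}_{2\sigma}$. Theorem 1.2 gives $\lambda_2(\sqrt{n}K) < \infty$ a.s. Also, a.s., for large enough $n$, and for every $t \in (0, 1)$,

\[ F^{-1}_{\nu_1}(t) = F^{-1}_{\mu_{\sqrt{n}K}}(t) = \sqrt{n}\,\mathbf{1}_{[1-n^{-1},1)}(t) + F^{-1}_{\tilde\mu_{\sqrt{n}K}}(t + n^{-1})\,\mathbf{1}_{(0,1-n^{-1})}(t). \]

The desired result follows then by plugging this identity in (26) and by using (25). □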
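For two empirical measures with the same number $n$ of atoms, each of mass $1/n$, the quantile functions in (26) are step functions and the integral reduces to a sum over sorted atoms. The sketch below implements this special case and shows how a single atom of mass $1/n$ placed at $\sqrt{n}$ affects $W_p$ for different $p$. The function name `wasserstein_p` and the toy measures are illustrative assumptions, not objects from the paper.

```python
import math

def wasserstein_p(xs, ys, p):
    """W_p between two empirical measures with equally many atoms of equal mass.

    Uses formula (26): the quantile functions are constant on ((i-1)/n, i/n].
    """
    if len(xs) != len(ys):
        raise ValueError("this sketch needs the same number of atoms")
    xs, ys = sorted(xs), sorted(ys)
    return (sum(abs(x - y) ** p for x, y in zip(xs, ys)) / len(xs)) ** (1.0 / p)

n = 10_000
base = [0.0] * n                                # all mass at 0
spiked = [0.0] * (n - 1) + [math.sqrt(n)]       # one atom of mass 1/n moved to sqrt(n)
w1 = wasserstein_p(base, spiked, 1)             # n^{-1/2}
w2 = wasserstein_p(base, spiked, 2)             # 1
w4 = wasserstein_p(base, spiked, 4)             # n^{1/4}
print(w1, w2, w4)
```

In this toy case $W_p^p = n^{p/2-1}$, so the effect of the isolated atom vanishes for $p < 2$, stays of order one for $p = 2$, and grows for $p > 2$.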
3. Proofs for the chain graph model

In this section we prove the bulk results in Theorem 1.6 and Corollary 1.7 and the edge results in Theorem 1.10.

Bulk behavior. 

Proof of Theorem 1.6. Since $\mu_K$ is supported in the compact set $[-1, +1]$, which does not depend on $n$, Weierstrass' theorem implies that the weak convergence of $\mu_K$ as $n \to \infty$ is equivalent to the convergence of all moments, and is also equivalent to the convergence in Wasserstein distance $W_p$ for every $p \ge 1$. Thus, it suffices to show that a.s. for any $\ell \ge 0$, the $\ell$-th moment of $\mu_K$ converges to $\mathbb{E}[r^{p}_{\ell}(0)]$ as $n \to \infty$. The sequence $(\mathbb{E}[r^{p}_{\ell}(0)])_{\ell \ge 0}$ will then necessarily be the sequence of moments of a probability measure $\mu$ on $[-1, +1]$ which is the unique accumulation point of $\mu_K$ as $n \to \infty$.

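For any $\ell \ge 0$ and $i \ge 1$ let $r^{p,n}_{\ell}(i)$ be the probability of return to $i$ after $\ell$ steps for the random walk on $\{1, \ldots, n\}$ with kernel $K$, so that the $\ell$-th moment of $\mu_K$ is $\frac{1}{n}\mathrm{Tr}(K^{\ell}) = \frac{1}{n}\sum_{i=1}^{n} r^{p,n}_{\ell}(i)$. Clearly, $r^{p,n}_{\ell}(i) = r^{p}_{\ell}(i)$ whenever $1 + \ell < i < n - \ell$. Therefore, for every fixed $\ell$, the ergodic theorem implies that almost surely,

\[ \lim_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n} r^{p,n}_{\ell}(i) = \lim_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n} r^{p}_{\ell}(i) = \mathbb{E}[r^{p}_{\ell}(0)]. \]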

This ends the proof. □ 



Proof of Corollary 1.7. The desired convergence follows immediately from Theorem 1.6 with $p(i) = (1 - V_i, 0, V_i)$ for every $i \ge 1$. The expression of the moments of $\mu$ follows from a straightforward path-counting argument for the return probabilities of a one-dimensional random walk. □
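The two ingredients of these proofs, locality of return probabilities and moments as averaged return probabilities, can be illustrated on a small simulation. The sketch below assumes the environment $p(i) = (1 - V_i, 0, V_i)$ with $V_i$ uniform on $[0,1]$ and reflecting ends ($V_1 = 1$, $V_n = 0$); the function name `return_probabilities` and the sample sizes are hypothetical choices. For this environment, counting the two return paths of length 2 suggests a second moment close to $2\,\mathbb{E}[V]\,\mathbb{E}[1-V] = 1/2$.

```python
import random

def return_probabilities(V, ell):
    """r[i] = probability that the walk started at site i is back at i after ell steps."""
    n = len(V)
    up = list(V)                      # up[i] = K(i, i+1)
    up[0], up[-1] = 1.0, 0.0          # reflecting ends, as in the convention V_1 = 1, V_n = 0
    down = [1.0 - u for u in up]      # down[i] = K(i, i-1)
    probs = []
    for start in range(n):
        dist = {start: 1.0}
        for _ in range(ell):
            nxt = {}
            for i, mass in dist.items():
                if up[i] > 0.0:
                    nxt[i + 1] = nxt.get(i + 1, 0.0) + mass * up[i]
                if down[i] > 0.0:
                    nxt[i - 1] = nxt.get(i - 1, 0.0) + mass * down[i]
            dist = nxt
        probs.append(dist.get(start, 0.0))
    return probs

rng = random.Random(0)
V = [rng.random() for _ in range(2000)]
ell = 4
r_short = return_probabilities(V[:40], ell)   # chain on 40 sites
r_long = return_probabilities(V[:80], ell)    # chain on 80 sites, same environment
max_gap = max(abs(r_short[i] - r_long[i]) for i in range(ell + 1, 40 - ell - 1))
second_moment = sum(return_probabilities(V, 2)) / len(V)   # (1/n) Tr(K^2)
print(max_gap, second_moment)
```

Away from the boundary the return probabilities do not depend on the size of the chain, which is exactly what allows the ergodic theorem to be applied to the averaged sum.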

Let us mention that the proof of Corollary 1.7 could have been obtained via the trace-moment method for symmetric tridiagonal matrices. Indeed, an analog of Lemma 2.1 allows one to replace $K$ by a symmetric tridiagonal matrix $S$. Although the entries of $S$ are not independent, the desired result follows from a variant of the proof used by Popescu for symmetric tridiagonal matrices with independent entries [30, Theorem 2.8]. We omit the details.

Remark 3.1 (Computation of the moments of $\mu$ for Beta environments). As noticed in Remark 1.8, the limiting spectral distribution $\mu$ is the arc-sine law when $\mathcal{L} = \delta_{1/2}$. Assume now that $\mathcal{L}$ is uniform on $[0, 1]$. Then for all integers $m \ge 0$ and $n \ge 0$,

\[ \mathbb{E}(V^m(1 - V)^n) = \int_0^1 u^m(1 - u)^n\,du = \mathrm{Beta}(n + 1, m + 1) = \frac{\Gamma(n + 1)\Gamma(m + 1)}{\Gamma(n + m + 2)}, \]

which gives

\[ \mathbb{E}(V^m(1 - V)^n) = \frac{n!\,m!}{(n + m + 1)!} = \frac{1}{(n + m + 1)\binom{n+m}{m}}. \]

The law of $\binom{n+m}{m}V^m(1 - V)^n$ is the law of the probability of having $m$ successes in $n + m$ tosses of a coin with a probability of success $p$ uniformly distributed in $[0, 1]$. Similar formulas may be obtained when $\mathcal{L}$ is a Beta law $\mathrm{Beta}(\alpha, \beta)$.
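The closed form for the uniform case can be verified in exact arithmetic by expanding $(1-u)^n$ with the binomial theorem and integrating term by term. This is a small sketch with hypothetical function names; it only covers the uniform law on $[0,1]$.

```python
from fractions import Fraction
from math import comb, factorial

def moment_by_expansion(m, n):
    """Integral of u^m (1-u)^n over [0, 1], via the binomial expansion of (1-u)^n."""
    return sum(Fraction((-1) ** j * comb(n, j), m + j + 1) for j in range(n + 1))

def moment_closed_form(m, n):
    """n! m! / (n + m + 1)!, the value stated in the remark."""
    return Fraction(factorial(m) * factorial(n), factorial(m + n + 1))

for m in range(6):
    for n in range(6):
        exact = moment_by_expansion(m, n)
        assert exact == moment_closed_form(m, n)
        assert exact == Fraction(1, (n + m + 1) * comb(n + m, m))

print(moment_closed_form(2, 3))  # 1/60
```

The last identity says that $\binom{n+m}{m}\mathbb{E}(V^m(1-V)^n) = 1/(n+m+1)$ for every $m$: with a uniformly distributed success probability, the number of successes in $n+m$ tosses is uniform on $\{0, \ldots, n+m\}$.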

Edge behavior. 

Proof of Theorem 1.10. Proof of the first statement. It is enough to show that for every $0 < a < 1$, there exists an integer $k_a$ such that for all $k \ge k_a$,

\[ \int_{-1}^{+1} x^{2k}\,\mu(dx) \ge a^{2k}. \tag{27} \]

By assumption, there exist $C > 0$ and $0 < t_0 < 1/2$ such that for all $0 < t < t_0$,

\[ \mathbb{P}(V \in [1/2 - t, 1/2 + t]) \ge Ct \]

where $V$ is a random variable of law $\mathcal{L}$. In particular, for all $0 < t < t_0$,



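\[ \mathbb{E}\bigl[V^{N_\gamma(i)}(1 - V)^{N_\gamma(i-1)}\bigr] \ge Ct\,\Bigl(\frac{1}{2} - t\Bigr)^{N_\gamma(i) + N_\gamma(i-1)} \]

and, if

\[ \|\gamma\|_\infty = \max\{i \ge 0 : \max(N_\gamma(i), N_\gamma(-i)) \ge 1\}, \]

then

\[ \begin{aligned} \int_{-1}^{+1} x^{2k}\,\mu(dx) &= \sum_{\gamma \in D_k}\prod_{i \in \mathbb{Z}} \mathbb{E}\bigl[V^{N_\gamma(i)}(1 - V)^{N_\gamma(i-1)}\bigr] \\ &\ge \sum_{\gamma \in D_k} (Ct)^{2\|\gamma\|_\infty + 2}\Bigl(\frac{1}{2} - t\Bigr)^{\sum_i (N_\gamma(i) + N_\gamma(i-1))} \\ &\ge \Bigl(\frac{1}{2} - t\Bigr)^{2k}\sum_{\gamma \in D_{k,\alpha}} (Ct)^{2k^{\alpha} + 2}, \end{aligned} \]

where $D_{k,\alpha} = \{\gamma \in D_k : \|\gamma\|_\infty \le k^{\alpha}\}$. Here we used that at most $2\|\gamma\|_\infty + 2$ factors of the product differ from $1$, that $Ct \le 1$, and that $\sum_i (N_\gamma(i) + N_\gamma(i-1)) = 2k$. Now, from the Brownian Bridge version of Donsker's Theorem (see e.g. [26] and references therein), for all $\alpha > 1/2$,

\[ \lim_{k\to\infty}\frac{|D_{k,\alpha}|}{|D_k|} = 1. \]

Since $|D_k| = \mathrm{Card}(D_k) = \binom{2k}{k}$, Stirling's formula gives $|D_k| \sim 4^k(\pi k)^{-1/2}$, and thus

\[ \int_{-1}^{+1} x^{2k}\,\mu(dx) \ge (\pi k)^{-1/2}(1 - 2t)^{2k}(Ct)^{2k^{\alpha} + 2}(1 + o(1)). \]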



We then deduce the desired result (27) by taking $t$ small enough such that $1 - 2t > a$ and $1/2 < \alpha < 1$. This completes the proof of the first statement.
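The Stirling estimate for the number of loops, $|D_k| = \binom{2k}{k} \sim 4^k(\pi k)^{-1/2}$, is easy to check with exact integer arithmetic. The helper name `stirling_ratio` below is an assumption made for this sketch.

```python
import math

def stirling_ratio(k):
    """Ratio of binom(2k, k) to its Stirling approximation 4^k / sqrt(pi k)."""
    # Python divides the two large integers exactly before converting to float.
    return math.comb(2 * k, k) / 4 ** k * math.sqrt(math.pi * k)

for k in (10, 100, 1000):
    print(k, stirling_ratio(k))   # the ratio approaches 1 as k grows
```

Since $4^k(1/2 - t)^{2k} = (1 - 2t)^{2k}$, this is the step that turns the factor $(1/2 - t)^{2k}$ of the previous display into $(1 - 2t)^{2k}$ in the final bound.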

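Proof of the second statement. One can observe that if $\mathcal{L} = \delta_p$ for some $p \in (0, 1)$ with $p \neq 1/2$, an explicit computation of the spectrum will provide the desired result, in accordance with Remark 1.8. For the general case, we get from [28], for any $2 \le k \le n - 1$,

\[ 1 - \lambda_2(K) \ge \frac{1}{4\max(B_k^+, B_k^-)} \]

where

\[ B_k^+ = \max_{i > k}\Bigl(\sum_{j=k+1}^{i}\frac{1}{\rho_j(1 - V_j)}\Bigr)\sum_{j \ge i}\rho_j \quad\text{and}\quad B_k^- = \max_{i < k}\Bigl(\sum_{j=i}^{k-1}\frac{1}{\rho_j V_j}\Bigr)\sum_{j \le i}\rho_j \]

with the convention $V_1 = 1 - V_n = 1$. Here we have fixed the value of $n$ and $\rho$ is any invariant (reversible) measure for $K$. It is convenient to take $\rho_1 = 1$ and for every $2 \le i \le n$,

\[ \rho_i = \frac{V_1 \cdots V_{i-1}}{(1 - V_2)\cdots(1 - V_i)}. \]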

By symmetry, it suffices to consider the case where $\mathcal{L}$ is supported in $[0, t]$ with $0 < t < 1/2$. Let us take $k = 2$. In this case, $B_2^- = 1$, and the desired result will follow if we show that $B_2^+$ is bounded above by a constant independent of $n$. To this end, we remark first that for any $\ell > j$ we have $\rho_\ell = \rho_j\prod_{m=j}^{\ell-1}\bigl(V_m/(1 - V_{m+1})\bigr)$. Therefore, setting $e^{-\gamma} = t/(1 - t) < 1$, we have $\rho_\ell \le \rho_j e^{-\gamma(\ell - j)}$. It follows that, for any $k < i$,

\[ \Bigl(\sum_{j=k+1}^{i}\frac{1}{\rho_j(1 - V_j)}\Bigr)\sum_{\ell \ge i}\rho_\ell \le \frac{1}{1 - t}\sum_{j=k+1}^{i}\sum_{\ell \ge i} e^{-\gamma(\ell - j)} \le \frac{(1 - e^{-\gamma})^{-2}}{1 - t} = \frac{1 - t}{(1 - 2t)^2}. \]

In particular, $B_2^+ \le (1 - t)/(1 - 2t)^2$, which concludes the proof. □
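Combining $B_2^- = 1$ with the bound on $B_2^+$ gives the explicit estimate $1 - \lambda_2(K) \ge (1 - 2t)^2/(4(1 - t))$, uniformly in $n$. The sketch below compares this with the numerically computed gap for a few chain sizes. It assumes $V_i$ uniform on $[0, t]$ with $t = 1/4$, uses the symmetrized tridiagonal matrix with off-diagonal entries $\sqrt{K(i,i+1)K(i+1,i)}$, and relies on NumPy; the function name `spectral_gap` is hypothetical.

```python
import numpy as np

def spectral_gap(V):
    """1 - lambda_2(K) for the birth-and-death chain with K(i, i+1) = V_i."""
    up = np.array(V, dtype=float)            # up[i] = K(i, i+1)
    up[0], up[-1] = 1.0, 0.0                 # convention V_1 = 1, V_n = 0
    down = 1.0 - up                          # down[i] = K(i, i-1)
    # By reversibility, K is similar to the symmetric matrix J with
    # J(i, i+1) = sqrt(K(i, i+1) * K(i+1, i)) and zero diagonal.
    off = np.sqrt(up[:-1] * down[1:])
    J = np.diag(off, 1) + np.diag(off, -1)
    eig = np.linalg.eigvalsh(J)              # ascending; eig[-1] is the eigenvalue 1
    return 1.0 - eig[-2]

t = 0.25
rng = np.random.default_rng(0)
gaps = [spectral_gap(rng.uniform(0.0, t, size=n)) for n in (100, 200, 400)]
lower_bound = (1 - 2 * t) ** 2 / (4 * (1 - t))   # = 1/12 for t = 1/4
print(gaps, lower_bound)
```

The computed gaps are expected to stay above the lower bound for every $n$, in line with the statement that the gap does not close when the support of $\mathcal{L}$ stays away from $1/2$.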
Figure 1. Plots illustrating Corollary 1.7. Each histogram corresponds to the spectrum of a single realization of $K$ with $n = 5000$, for various choices of $\mathcal{L}$. From left to right, $\mathcal{L}$ is the uniform law on $[0, t] \cup [1 - t, 1]$ for $t = 1/8$, $t = 1/4$.
Figure 2. Plots illustrating Corollary 1.7 and the second statement of Theorem 1.10. Each histogram corresponds to the spectrum of a single realization of $K$ with $n = 5000$, for various choices of $\mathcal{L}$. From left to right and top to bottom, $\mathcal{L}$ is uniform on $[0, t]$ with $t = 1/8$, $t = 1/4$, $t = 1/2$, and $t = 1$.




Figure 3. Plots illustrating Corollary 1.7. Each histogram corresponds to the spectrum of a single realization of $K$ with $n = 5000$, for various choices of $\mathcal{L}$. From left to right and top to bottom, $\mathcal{L}$ is uniform on $[t, 1 - t]$ with $t = 0$, $t = 1/8$, $t = 1/4$, $t = 1/2$. The last case corresponds to the arc-sine limiting spectral distribution mentioned in Remark 1.8.